JOINT SOURCE AND CHANNEL CODING*

ABSTRACT. The advantages and disadvantages of combining the functions of source coding ("data compression") and channel coding ("error correction") into a single coding unit are considered. Particular attention is given to linear encoders, both for sources and for channels, because their ease of implementation makes their use desirable in practice. It is shown that, without loss of optimality, a joint source/channel linear encoder may be used when the goal is the distortionless reproduction of the source at the destination. On the other hand, it is shown that in general there is an inherent and significant loss of optimality if a joint source/channel linear encoder is used when the goal is relaxed to reproduction of the source within some specified non-negligible distortion.

1. INTRODUCTION


Our aim in this tutorial paper is to treat the separability of the two basic coding functions that arise in communications, namely source coding and channel coding, first in the general case and then in the important practical case when these functions are both linear. We shall find that the desirability of joint linear source/channel coding is closely (and, to us, surprisingly) linked to the degree of fidelity specified in the reconstruction of the source at the destination.


*This research was supported by the Office of Naval Research under Contract ONR-N00014-64-C-1183.

The model of a communications system with separate source and channel coding is shown in Fig. 1.


[Figure: block diagram with the blocks Discrete Information Source, Source Encoder, Channel Encoder, Discrete Channel, Channel Decoder, Source Decoder.]

Fig. 1 A Digital Communications System with Separate Source and Channel Coding

It will be noted that there are three different subscripts on the various symbols shown in Fig. 1, namely, i, j, and k. We use this artifice to distinguish between sequences that may not be equi-numerous over a long time interval. For instance, there may be more source output digits per second, say, than encoded source digits per second; in fact, we hope that there are many more, so that the source encoder is doing well its task of "data compression". Similarly, there may be fewer encoded source digits per second than encoded channel digits per second; we may be forced into this situation by the need to insert redundancy into the channel input digits so that the channel decoder can do well its task of "error correction".


Roughly speaking, we may use the terms "source coding", "data compression", and "redundancy removal" as synonymous. Again roughly speaking, we may use the terms "channel coding", "error correction", and "redundancy insertion" as synonymous. A wag might accuse the International Brotherhood of Information Theorists of featherbedding: it provides jobs for those who take out redundancy and jobs for those who put redundancy back in, at least when source coding and channel coding are performed separately as shown in Fig. 1. But it is a serious question to ask whether one box, a "joint source/channel encoder" as shown in Fig. 2, couldn't do a better job (or at least do the same job more economically) than does the tandem combination of the "source encoder" and "channel encoder" boxes in Fig. 1. As we shall soon see, this simple question has a rather complicated answer.




In fact, one of the important results in Shannon's celebrated 1948 paper was his demonstration that the source and channel coding functions are fundamentally separable in the sense that, without loss of efficiency in the use of a given channel to transmit a given source with some specified fidelity to a destination, these two coding subsystems can be designed entirely independently. One can always design an optimum system by combining (1) a source encoder which has been designed to transform (at least approximately) the source output into a stream of independent binary digits, each equally likely to be a 0 or a 1, and (2) a channel encoder which has been designed quite independently of the actual statistics for its input binary digits (i.e., has been designed for use with a maximum-likelihood decoder). Fano has aptly commented on the significance of this fundamental separability: it means that those parts of the communications system to the right of the dashed line in Fig. 1 can always be designed, with no loss of optimality, as a system to transmit binary digits reliably. Binary digits are a kind of standard interface between the source coding world and the channel coding world, and one pays no surtax in efficiency for crossing at this interface.

As characteristic as the generality of the above-stated separability result of Shannon is the fact that his 1948 paper gives little clue as to how complex an efficient communications system becomes when the source and channel coding functions are separated as in Fig. 1. With tongue in cheek, we now assert:

Theorem 1: For a given efficiency (measured in number of source letters transmitted per use of the channel) and fidelity (measured in the quality of the source reproduction at the destination) achievable by separate source and channel coding for a given source and a given channel, there always exists a joint source/channel coding scheme for the same source and channel that is at least as efficient, that gives at least as much fidelity, and is no more complex than the separate coding system.

Proof: Let Fig. 1 be a diagram of the hypothesized separate system. Then, in Fig. 1, draw a large box to enclose the "source encoder" and "channel encoder". Draw a second such box to enclose the "channel decoder" and "source decoder". Call the first new box the "source/channel encoder" and call the second new box the "source/channel decoder". You have just constructed a joint source/channel coding system that satisfies the assertion in the theorem. (Naturally, you might be able to build a simpler joint system that works at least as well; in fact, you might be able to build a far simpler system!)

Its triviality notwithstanding, Theorem 1 does illuminate the chief attractive feature of joint source/channel coding, namely, the possible reduction in complexity compared to a similarly-performing system with separate source and channel coding. We will pursue this point further, but not without first giving a caveat: the reduction in complexity is purchased by a loss in flexibility! If one opts for a jointly coded system, he can no longer easily adapt his system later to a different source; in the separately designed system, one could continue to use the same channel coding subsystem, changing only the source encoder to the source encoder matched to the new source. Telephone companies worldwide are beginning to experience how painful this loss of flexibility can be. Most telephone systems were originally designed as a joint source/channel coding system (even if the designers were unaware that they were doing "coding") for transmitting the voice source over a narrowband channel. As more and more of their customers are changing from voice sources to data sources, the telephone companies are madly scrambling to adapt their communications brontosaurus to its new environment.


2. DEFINITIONS AND PRELIMINARIES

So that we can begin to speak more precisely, as engineers should, we state a few definitions.

A binary memoryless source (BMS) with parameter q is a device whose output is a sequence $U_1, U_2, \ldots$ of statistically independent, binary-valued random variables such that

$$P(U_i = 1) = 1 - P(U_i = 0) = q, \quad \text{all } i.$$

This is the only source that we shall consider hereafter; it is general enough for all our purposes even if it is a realistic model of only a few actual information sources. When q = 1/2, the BMS is called the binary symmetric source (BSS); this very special type of BMS will play a key role in what follows. In fact, the goal of the source encoder in Fig. 1 is to make its output a good approximation to the output of a BSS.

A binary symmetric channel (BSC) with cross-over probability p is a memoryless channel which accepts binary digits at its input and emits binary digits at its output according to the following conditional probabilities:

$$P(Y = 1 \mid X = 0) = P(Y = 0 \mid X = 1) = p$$

$$P(Y = 1 \mid X = 1) = P(Y = 0 \mid X = 0) = 1 - p.$$

Again, although the BSC is a realistic model for only a few actual discrete channels, it is general enough for our purposes.

Next, we recall some well-known results from information theory [1,2,3,4]. Let

$$h(x) = -x \log_2 x - (1 - x) \log_2 (1 - x) \qquad (0 \le x \le 1)$$

be the usual binary entropy function. Then the entropy (or "rate") of the BMS is given by

$$H(U) = h(q) \quad \text{bits/letter}$$


where "letter" means a binary digit emitted by the source. According to Shannon's Noiseless Coding Theorem, H(U) is the lower limit of rate, measured in encoded binary digits per source letter, for a source encoder such that the source output sequence can be reconstructed from the encoder output with an arbitrarily-small specified per-digit error probability. Equivalently, 1/H(U) is the upper limit of compression, measured in source letters per encoded binary digit, which can be achieved by coding schemes which convert the source output into a stream of binary digits from which the source output can be reconstructed with an arbitrarily-small specified per-digit error probability.
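As a small numerical companion to the definitions above, the following sketch evaluates the binary entropy function h(x) and the compression limit 1/H(U) of a BMS. The function names `binary_entropy` and `max_compression_ratio` are illustrative choices, not notation from the paper.

```python
import math


def binary_entropy(x: float) -> float:
    """h(x) = -x log2 x - (1 - x) log2 (1 - x), with h(0) = h(1) = 0."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie in [0, 1]")
    if x in (0.0, 1.0):
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def max_compression_ratio(q: float) -> float:
    """Upper limit 1/H(U), in source letters per encoded binary digit.

    Only meaningful for 0 < q < 1; a deterministic source (q = 0 or 1)
    has H(U) = 0 and no finite limit.
    """
    return 1.0 / binary_entropy(q)


# The BSS (q = 1/2) has H(U) = 1 bit/letter and cannot be compressed.
print(binary_entropy(0.5), max_compression_ratio(0.5))
# A biased source leaves room for compression.
print(round(binary_entropy(0.11), 2), round(max_compression_ratio(0.11), 2))
```

The closer q is to 0 or 1, the smaller H(U) and the larger the attainable compression ratio.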

The capacity of the BSC is given by

$$C = 1 - h(p) \quad \text{bits/use},$$

where a "use" means the transmission of a single binary digit through the channel. According to Shannon's Noisy Coding Theorem, C is the upper limit of the rate of binary digits from a BSS (which we can think of as being the output of the source encoder in Fig. 1) per channel use for a channel encoder such that there is a channel decoder which delivers the BSS digits with an arbitrarily-small specified per-digit error probability.
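The BSC and its capacity can be sketched in a few lines. The helper names `bsc_capacity` and `bsc_transmit`, and the use of a seeded `random.Random` generator, are assumptions made for illustration only. The channel is modelled as the modulo-two addition of an independent error digit, equal to 1 with probability p, to each input digit.

```python
import math
import random


def binary_entropy(x: float) -> float:
    """Binary entropy function h(x) in bits."""
    if x in (0.0, 1.0):
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def bsc_capacity(p: float) -> float:
    """C = 1 - h(p), in bits per channel use."""
    return 1.0 - binary_entropy(p)


def bsc_transmit(x, p, rng):
    """Pass the binary sequence x through a BSC with crossover probability p.

    Each digit is flipped independently with probability p, i.e. the
    output is x + e (mod 2) for a memoryless error pattern e.
    """
    return [xi ^ (1 if rng.random() < p else 0) for xi in x]


rng = random.Random(0)  # seeded only to make the sketch repeatable
x = [0, 1, 1, 0, 1, 0, 0, 1]
print(round(bsc_capacity(0.10), 2))
print(bsc_transmit(x, 0.10, rng))
```

A noiseless channel (p = 0) has capacity 1 bit/use, and p = 1/2 gives capacity 0.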

A very fundamental characterization of an information source is that given by its rate-distortion function. The rate-distortion function of the BMS is given by

$$R(D) = \begin{cases} h(q) - h(D) \ \text{bits/letter}, & 0 \le D \le \min(q, 1-q) \\ 0, & D > \min(q, 1-q) \end{cases}$$

where D is the Hamming distortion, defined as the average over the source digits of $P(\hat{U}_i \neq U_i)$, i.e., D is the per-digit error probability in the source reconstruction. According to Shannon's Theorem for Coding Relative to a Fidelity Criterion, R(D) is the lower limit of rate, measured in binary digits per source letter, for a source encoder such that the source output sequence can be reconstructed from the encoder output with a distortion of D or less.
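The piecewise formula for R(D) above translates directly into code. This is a minimal sketch; the name `bms_rate_distortion` is an illustrative choice.

```python
import math


def binary_entropy(x: float) -> float:
    """Binary entropy function h(x) in bits."""
    if x in (0.0, 1.0):
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def bms_rate_distortion(q: float, d: float) -> float:
    """R(D) of a BMS with parameter q under Hamming distortion, bits/letter."""
    if d < 0.0:
        raise ValueError("distortion must be non-negative")
    if d > min(q, 1.0 - q):
        # Guessing the more likely letter already achieves this distortion.
        return 0.0
    return binary_entropy(q) - binary_entropy(d)


# For the BSS (q = 1/2), R(D) = 1 - h(D).
for d in (0.0, 0.11, 0.25, 0.5):
    print(d, round(bms_rate_distortion(0.5, d), 2))
```

At D = 0 the function reduces to H(U) = h(q), recovering the Noiseless Coding Theorem limit.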
3. LINEAR CODING


We now consider the special case of linear coding, both linear source coding and linear channel coding. We begin with the latter because the relevant theory is more widely known.

A [block] linear (N, K) binary channel encoder is specified by a K x N binary matrix G, of rank K, in the manner that

$$X = V G \qquad (1)$$

where $V = [V_1, V_2, \ldots, V_K]$ is the information (row) vector, and $X = [X_1, X_2, \ldots, X_N]$ is the codeword. The operations in (1), and hereafter for all matrices and vectors, are in the finite field GF(2), i.e., in modulo-two arithmetic. The code rate is R = K/N bits/use.
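Equation (1) can be illustrated with a small code. The sketch below uses a (7, 4) Hamming code in systematic form as the example matrix G; that particular code, and the helper name `gf2_vec_mat`, are assumptions chosen for illustration and do not come from the paper.

```python
def gf2_vec_mat(v, m):
    """Row vector times matrix in modulo-two arithmetic (GF(2))."""
    return [sum(v[i] & m[i][j] for i in range(len(v))) % 2
            for j in range(len(m[0]))]


# Illustrative K x N generator matrix (K = 4, N = 7), G = [I | P].
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

V = [1, 0, 1, 1]          # information vector, K digits
X = gf2_vec_mat(V, G)     # codeword X = V G, N digits
rate = len(G) / len(G[0])  # R = K/N bits/use

print(X, rate)
```

Because G is systematic, the first K digits of X repeat the information vector and the remaining N - K digits are parity checks.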

It is well-known [2,3] that linear channel coding is sufficiently general to attain the performance promised by the Noisy Coding Theorem (although we hasten to add that it is only the encoder which is linear; a good channel decoder is always nonlinear!). That is, for a given $\varepsilon > 0$ and a given R such that R < C, there exist, for sufficiently large N, linear (N, K) encoders and appropriate decoders such that

$$P(\hat{X} \neq X) \le \varepsilon$$

when this channel coding system is used on a BSC of capacity C, regardless of the source statistics. In fact, it is known that no other type of coding can give a significantly smaller decoding error probability. Add to this the simplicity with which a linear encoder can be implemented and you will see why no one seriously proposes the use of other than linear channel encoders.


For the given G, one can always find an (N-K) x N matrix H, of rank N-K, such that

$$G H^T = 0 \qquad (2)$$

where the superscript T denotes "transpose". Moreover, a given vector X is a codeword if and only if

$$X H^T = 0.$$

If one writes the vector $Y = [Y_1, Y_2, \ldots, Y_N]$ received over the BSC as Y = X + E, where $E = [E_1, E_2, \ldots, E_N]$ is the error pattern, then it follows from (2) that

$$S = Y H^T = E H^T. \qquad (3)$$
The (row) vector $S = [S_1, S_2, \ldots, S_{N-K}]$ is consequently called the syndrome because it depends only on the error pattern E that has infected the codeword in its passage through the BSC.
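The claim that the syndrome depends only on the error pattern can be checked numerically. The sketch below again assumes an illustrative (7, 4) Hamming code with G = [I | P] and H = [P^T | I]; the matrices and helper names are not taken from the paper.

```python
def gf2_vec_mat(v, m):
    """Row vector times matrix in modulo-two arithmetic (GF(2))."""
    return [sum(v[i] & m[i][j] for i in range(len(v))) % 2
            for j in range(len(m[0]))]


def transpose(m):
    return [list(col) for col in zip(*m)]


G = [  # K x N generator matrix, G = [I | P]
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]
H = [  # (N-K) x N parity-check matrix, H = [P^T | I], so that G H^T = 0
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]
HT = transpose(H)

X = gf2_vec_mat([1, 0, 1, 1], G)        # a codeword
E = [0, 0, 0, 0, 0, 1, 0]               # error pattern added by the BSC
Y = [x ^ e for x, e in zip(X, E)]       # received vector Y = X + E

syndrome_of_codeword = gf2_vec_mat(X, HT)  # zero for every codeword
syndrome_of_received = gf2_vec_mat(Y, HT)  # S = Y H^T
syndrome_of_error = gf2_vec_mat(E, HT)     # E H^T

print(syndrome_of_codeword, syndrome_of_received, syndrome_of_error)
```

Whatever codeword is sent, the same error pattern always produces the same syndrome, which is what allows the decoder to work from S alone.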

It is a well-known fact in coding theory that, without loss of optimality, the decoder for a linear code can always be built in the manner shown in Fig. 3 such that the decoder first forms the syndrome and then estimates the error pattern solely from this syndrome. One should not be misled by Fig. 3; the leftmost and rightmost boxes therein are linear devices and easy to implement, but the box labelled "error pattern estimator" may be unimaginably difficult to implement for very long and powerful codes.

Fig. 3 A Syndrome Decoder for a Linear Code

We now turn to the description of linear source coding. A [block] linear (N, K) source encoder is specified by an (N-K) x N binary matrix H, of rank N-K, in the manner that

$$V = U H^T \qquad (4)$$

where $U = [U_1, U_2, \ldots, U_N]$ is the source message, and $V = [V_1, V_2, \ldots, V_{N-K}]$ is the encoded version of the source message. (We shall place the subscript c or s on K, N, H and G whenever the context does not make it clear whether we are specifying the channel encoder or the source encoder, respectively.) Thus, the compression ratio of a linear (N, K) source encoder is

$$\beta = N/(N-K).$$

The rate of this linear source coding scheme is

$$R_L = 1/\beta = 1 - K/N.$$
The reason for our choosing the above notation for linear source encoding is the interpretation that we now wish to make. We first make the key observation that the error pattern E of the BSC is statistically identical to the output vector U of a BMS with parameter q equal to p. Thus, we are always free to consider that a linear source encoder treats the output of the BMS as an "error pattern" and forms the "syndrome" of this error pattern, according to (4), which syndrome is then the encoded version of the source message. Hence, we can always consider linear source coding conceptually as shown in Fig. 4 where the source decoder is an "error pattern estimator". This interpretation of linear source coding appeared first in the literature in the work of Ohnsorge and has been rather fully developed by Ancheta.
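The syndrome interpretation of linear source coding can be sketched on a toy scale. The example assumes the parity-check matrix of a (7, 4) Hamming code as H and a brute-force minimum-weight error pattern estimator; both are illustrative assumptions, and such a table lookup is only feasible for very short blocks.

```python
from itertools import product

H = [  # (N-K) x N matrix defining the source encoder, N = 7, N - K = 3
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]


def source_encode(u):
    """V = U H^T over GF(2): the syndrome of the source block u."""
    return tuple(sum(a & b for a, b in zip(u, row)) % 2 for row in H)


def build_estimator(n):
    """Map each syndrome to a lowest-weight block that produces it."""
    table = {}
    for pattern in sorted(product([0, 1], repeat=n), key=sum):
        table.setdefault(source_encode(pattern), pattern)
    return table


ESTIMATOR = build_estimator(7)


def source_decode(v):
    """The 'error pattern estimator' acting as source decoder."""
    return ESTIMATOR[v]


U = (0, 0, 0, 1, 0, 0, 0)   # a sparse block, typical when q is small
V = source_encode(U)        # 3 encoded digits for 7 source digits
U_hat = source_decode(V)
print(V, U_hat == U, len(U) / len(V))
```

With this H, every block containing at most one 1 is reconstructed exactly, while denser blocks are reconstructed with errors; for small q the dense blocks are rare, which is why the scheme compresses well.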


Fig. 4 The Syndrome-Source-Coding Interpretation of Linear Source Coding

4. JOINT LINEAR SOURCE/CHANNEL CODING: THE DISTORTIONLESS CASE

We now consider linear source encoding when the goal is reproduction of the source with a negligibly small (but non-zero) probability $\varepsilon$ of digit error, so-called "distortionless coding".

Consider a BMS with parameter q where, for convenience with no real loss of generality, we take $0 \le q \le 1/2$. For the BSC with crossover probability p equal to q, we know there is a linear channel coding scheme $(G_c, H_c)$ such that, for any given $\delta > 0$, it has

$$R = C - \delta = 1 - h(q) - \delta$$

and achieves per-digit error probability $\varepsilon$ or less in the estimated codeword $\hat{X}$. For this channel coding scheme, the per-digit error probability in the vector $\hat{E}$ of Fig. 3 coincides with that in the vector $\hat{X}$. Thus, if we use these same two matrices as the $G_s$ and $H_s$ of the source coding scheme of Fig. 4, it follows that the per-digit error probability of the reconstruction $\hat{U}$ is again the same, i.e., is $\varepsilon$ or less. (Here we assume that the source coding scheme uses the same error pattern estimator as did the channel coding scheme.) The compression ratio of this source coding scheme is

$$\beta = \frac{1}{1 - R} = \frac{1}{h(q) + \delta}$$

which is arbitrarily close to the upper limit of achievable compression ratios, 1/H(U), established by the Noiseless Coding Theorem. Thus, as has been observed by Hellman and Ancheta, linear source encoding entails no loss of optimality when the goal is distortionless reproduction of the source.

But we now recall that linear channel coding never entails a loss of optimality. Moreover, if we have

$$N_s - K_s = K_c$$

(which can always be achieved simply by redefining the block lengths, if necessary, to be integer multiples of the original block lengths), then we can write for the tandem combination of the two linear systems

$$X = V G_c = U H_s^T G_c.$$

It follows then that we can consider $A = H_s^T G_c$ to be the defining matrix of a linear joint source/channel encoder which operates as

$$X = U A.$$

It follows, as first observed by Hellman, that joint linear source/channel encoding entails no loss of optimality when the goal is distortionless reproduction of the source. Moreover, the implementation of the matrix $A = H_s^T G_c$ cannot avoid being far simpler in general than the separate implementation of the matrices $H_s^T$ and $G_c$.
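The collapse of the tandem encoders into the single matrix A can be checked on a toy case. In the sketch below, $H_s$ is taken to be the parity-check matrix of a (7, 4) Hamming code and $G_c$ the generator of a (6, 3) code, so that $N_s - K_s = K_c = 3$; these particular matrices and the helper names are assumptions made for illustration.

```python
from itertools import product


def gf2_mat_mul(a, b):
    """Matrix product over GF(2); matrices are lists of rows of 0/1 ints."""
    return [[sum(a[i][k] & b[k][j] for k in range(len(b))) % 2
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


H_s = [  # source encoder matrix, (N_s - K_s) x N_s = 3 x 7
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]
G_c = [  # channel encoder matrix, K_c x N_c = 3 x 6
    [1, 0, 0, 0, 1, 1],
    [0, 1, 0, 1, 0, 1],
    [0, 0, 1, 1, 1, 0],
]

A = gf2_mat_mul(transpose(H_s), G_c)  # joint encoder, N_s x N_c = 7 x 6


def joint_encode(u):
    """X = U A."""
    return gf2_mat_mul([list(u)], A)[0]


def tandem_encode(u):
    """X = (U H_s^T) G_c: source encoder followed by channel encoder."""
    v = gf2_mat_mul([list(u)], transpose(H_s))
    return gf2_mat_mul(v, G_c)[0]


agree = all(joint_encode(u) == tandem_encode(u)
            for u in product([0, 1], repeat=7))
print(len(A), len(A[0]), agree)
```

The agreement is just associativity of matrix multiplication over GF(2); the point of the sketch is that a single 7 x 6 matrix replaces two cascaded devices.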
s c 

Example: Suppose that we are to transmit, with negligibly small distortion, a BMS with q = .10 through a BSC with p = .10. Since h(.10) = 0.47, it follows that a compression ratio of 1/h(.10) = 2.13 can be approached, and that a channel coding rate of C = 1 - h(.10) = .53 can be approached. Thus, an overall efficiency of (2.13) x (0.53) = 1.13 source letters per channel use can be approached arbitrarily closely with joint source/channel linear coding, and no larger overall efficiency can be obtained by any distortionless coding scheme. In particular, for suitably large K, we can find an R = 1/2 linear channel encoder specified by

[matrix $G_c$, expressed in terms of P, not legible in the source]

(where P is some K x K binary matrix) and a $\beta = 2$ linear source encoder

[matrix $H_s$ not legible in the source]

such that the overall distortion is smaller than the specified small amount. But then

$$A = H_s^T G_c$$

[explicit form of A not legible in the source]

describes a linear joint source/channel encoder which has overall efficiency $\beta R = 1$, quite close to the theoretical limit. Moreover, we see that A can be implemented quite straightforwardly from a device which implements only P, whereas implementation of $G_c$ and $H_s$ would each require implementation of P in separate source and channel coding. It is interesting to note that A is an N x N matrix, but that its rank is only N/2; this lack of full rank appears to be fundamental for useful linear joint source/channel encoders.
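The arithmetic of the example above can be reproduced as follows. This is only a numerical check of the quoted figures; the variable names are illustrative.

```python
import math


def binary_entropy(x: float) -> float:
    """Binary entropy function h(x) in bits."""
    if x in (0.0, 1.0):
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


q = p = 0.10

compression_limit = 1.0 / binary_entropy(q)   # source letters per encoded digit
capacity = 1.0 - binary_entropy(p)            # encoded digits per channel use
efficiency_limit = compression_limit * capacity  # source letters per channel use

# The specific scheme in the example: beta = 2 source encoder, R = 1/2 channel encoder.
scheme_efficiency = 2 * 0.5

print(round(compression_limit, 2), round(capacity, 2),
      round(efficiency_limit, 2), scheme_efficiency)
```

The scheme with $\beta R = 1$ falls short of the limit of about 1.13 source letters per channel use by roughly twelve percent.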

We conclude that joint linear source/channel coding is a highly attractive approach when the goal is the distortionless reproduction of the source.

5. JOINT LINEAR SOURCE/CHANNEL CODING: THE NON-NEGLIGIBLE DISTORTION CASE

With many actual data sources (e.g., with facsimile), one is often content to accept non-negligible distortion D in the source reproduction (e.g., D = 1/10). The rate-distortion function of the source specifies how such a relaxed demand on the fidelity of reconstruction can be translated into more efficient use of the channel, i.e., fewer uses of the channel for each source letter.

Following recent work by Ancheta [9], we now show that, for a given D (non-negligibly) greater than zero, the performance of linear source coding is bounded in general strictly below the compression ratio 1/R(D) which Shannon has shown can be approached arbitrarily closely by some sort of source coding.

The key (and clever) idea in Ancheta's proof that linear source encoding for non-negligible distortion is inherently suboptimal was his exploitation of the fact that a linear source encoder "cannot see" a vector which lies in the null space of the matrix $H_s^T$, i.e., its output is zero for any vector which could be the output of the linear device which implements the matrix $G_s$.

Consider then the situation shown in Fig. 5, where we have merely supplemented the source coding system of Fig. 4 by adding some devices that have no effect on the latter's operation. If D is the per-digit error probability in $\hat{U}$ for the linear source coding scheme, we see from Fig. 5 that it is also the per-digit error probability in $\hat{X}'$. Now, as is well-known in coding theory, given $H_s$, one can always choose $G_s$ such that $G_s$ has an identity matrix in some K of its columns. But then V' is just the vector composed of the K digits in these K positions of X'. It follows that the per-digit error probability in $\hat{V}'$ is at most (N/K)D. But, since this is also the fidelity with which the BSS (not the BMS!) in Fig. 5 is being transmitted through the BSC created by considering the output of the BMS to be an error pattern, and since K digits of the BSS are being transmitted with N uses of this BSC with capacity C = 1 - h(q), it follows from the properties of the rate-distortion function of the BSS that


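$$N[1 - h(q)] \ge K \, R_{BSS}\!\left(\tfrac{N}{K} D\right) = K \left[1 - h\!\left(\tfrac{N}{K} D\right)\right]$$

or, equivalently,

$$h\!\left(\tfrac{N}{K} D\right) \ge 1 - \tfrac{N}{K}\,[1 - h(q)]. \qquad (5)$$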

We can put (5) into more revealing form in terms of

$$R_L = 1/\beta = 1 - K/N.$$

Then (5) becomes

$$D \ge (1 - R_L)\, h^{-1}\!\left(\frac{h(q) - R_L}{1 - R_L}\right) \qquad (6)$$

where $h^{-1}$ is the inverse (made unique by restricting its values to be between 0 and 1/2) of the binary entropy function.

The significance of (6) can perhaps be most easily seen by its specialization to the BSS, i.e., to q = 1/2. Then h(q) = 1 and (6) simplifies to

$$D \ge (1 - R_L)/2. \qquad (7)$$

In Fig. 6a, we have plotted the bound (7) on the attainable distortion D of a linear source coding scheme of rate $R_L$ for the BSS, together with the rate-distortion function R(D) = 1 - h(D) of the BSS. This figure clearly illustrates how far away from optimal a linear source coding system must be when a non-negligible D is specified. For example, with D = .11, R(D) = .50 but $R_L \ge .78$. Thus the linear scheme can have at best $\beta = 1/R_L = 1.28$, compared to the compression ratio 1/R(D) = 2 that can be approached by more general source coding schemes.

A similar interpretation can be made from Fig. 6b where we have shown the rate-distortion function R(D) for the general BMS and also the corresponding bound on $R_L$ from (6).

Fig. 6 Bounds on the Achievable Rate $R_L$ with Linear Source Coding
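The bound on linear source coding can be evaluated numerically. The sketch below assumes that (6) takes the form $D \ge (1 - R_L)\, h^{-1}[(h(q) - R_L)/(1 - R_L)]$, which is what results from substituting $N/K = 1/(1 - R_L)$ into (5) and which reduces to (7) at q = 1/2. The inverse entropy function is computed by bisection; all function names are illustrative.

```python
import math


def binary_entropy(x: float) -> float:
    """Binary entropy function h(x) in bits."""
    if x in (0.0, 1.0):
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def inverse_binary_entropy(y: float, tol: float = 1e-12) -> float:
    """Inverse of h restricted to [0, 1/2], found by bisection."""
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if binary_entropy(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def min_distortion_linear(q: float, rate: float) -> float:
    """Lower bound on D for a linear source code of rate R_L (assumed form of (6)).

    For R_L >= h(q) the bound is vacuous and 0 is returned.
    """
    hq = binary_entropy(q)
    if rate >= hq:
        return 0.0
    return (1.0 - rate) * inverse_binary_entropy((hq - rate) / (1.0 - rate))


def min_distortion_any(q: float, rate: float) -> float:
    """Distortion on the R(D) curve at the same rate: h(D) = h(q) - R."""
    return inverse_binary_entropy(max(binary_entropy(q) - rate, 0.0))


for r in (0.25, 0.50, 0.78):
    print(r, round(min_distortion_linear(0.5, r), 3),
          round(min_distortion_any(0.5, r), 3))
```

For the BSS the first column of results follows the straight line (1 - R_L)/2, while the second follows the curved rate-distortion function; the gap between them is the loss that the figure illustrates.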

Ancheta [9] actually has a lot more to say about the non-optimality of linear source coding with non-negligible distortion, but we shall leave the rest for him to tell in his own publications, except to mention his conjecture that the achievable $R_L$ is actually more strictly bounded away from R(D) according to the dashed line shown in Fig. 6b.

We now give a simple argument to show that the inherent lack of optimality of linear source coding in the non-negligible distortion case implies in general an inherent lack of optimality for linear joint source/channel encoding in the non-negligible distortion case.


Suppose that the $N_s \times N_c$ matrix A describes a linear joint source/channel encoder, for a BMS and BSC, which achieves distortion D (where D is not negligibly small). Suppose that A has rank r. Then one can always find an $r \times N_s$ matrix $H_s$ of rank r and an $r \times N_c$ matrix $G_c$ of rank r such that $A = H_s^T G_c$. Thus, we can consider the matrix $H_s^T$ as describing a linear source encoder and the matrix $G_c$ as describing a linear channel encoder; the original linear joint source/channel encoder is equivalent to separate encoding with these derived linear encoders.


Let D' be the best obtainable distortion when the BMS is reconstructed directly from the output of the linear source encoder $H_s^T$. It follows that $D' \le D$, because the best service which the channel encoder $G_c$ can provide is to permit perfect transmission of the source encoder output to the best source reconstructor. Hence, the rate $R_L$ of the linear source encoder must satisfy (6) for the given distortion D.

The overall efficiency of the linear separate source/channel coding system (and hence also of the entirely equivalent original linear joint coding system) is $\beta R_c = R_c / R_L \le 1/R_L$ source letters per channel use, where $R_c$ denotes the rate of the channel encoder $G_c$ and the inequality follows from the fact that $R_c \le 1$. On the other hand, there exist coding systems whose overall efficiency approaches C/R(D) source letters per channel use, where C is the capacity of the BSC and R(D) is the rate-distortion function of the BMS. Thus, when, for a given D, the bound (6) specifies an $R_L$ such that $R_L > R(D)/C$, then there is an inherent loss of optimality when linear joint source/channel encoding is used. In other words, when the bound (6) gives an $R_L$ which exceeds R(D) by a factor of more than 1/C, then linear joint source/channel encoding is sub-optimum.

Example: Consider the BSS together with the BSC having p = .10, and suppose that D = 1/4 is specified. Then, R(D) = h(1/2) - h(1/4) = .19. From (6), we find $R_L \ge .50$. Thus, $R_L$ is at least (.50)/(.19) = 2.63 times as great as R(D). But 1/C = 1.89. Because 2.63 > 1.89, it follows that a linear joint source/channel coding system must be sub-optimum. To put it another way, any such linear joint coding system has an efficiency of at most $1/R_L = 2$, whereas there exist more general coding systems whose efficiency approaches C/R(D) = 2.79 source letters per channel use.

We should point out in closing that a joint linear source/channel coding system can sometimes "accidentally" be optimal when $R_L$, as given by (6), exceeds R(D) by a factor of only 1/C or less. In the above example, if we had taken D = .10 rather than D = 1/4, we would have found $R_L \ge .80$ and R(D) = .53 so that $R_L/R(D) = 1.51 < 1/C = 1.89$. Here C/R(D) = 1 is the maximum approachable efficiency. But the "straight wire" encoder, which merely transmits the BSS output directly over the channel, has efficiency 1 and distortion D = .10. We can consider this trivial but optimum coding scheme as the linear joint source/channel coding scheme with A = I. [The reason for this accidental optimality is that the given BSC happens to be the appropriate "forward channel" for the given distortion D and the BSS, cf. Berger.]
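The two numerical cases discussed above (the BSS over a BSC with p = .10, at D = 1/4 and at D = .10) can be recomputed as follows. The sketch assumes the BSS form (7) of the bound, $R_L \ge 1 - 2D$, and is valid only for 0 < D < 1/2; the function and key names are illustrative. Unrounded values differ slightly in the second decimal place from the figures in the text, which rounds intermediate quantities.

```python
import math


def binary_entropy(x: float) -> float:
    """Binary entropy function h(x) in bits."""
    if x in (0.0, 1.0):
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def bss_over_bsc(p: float, d: float) -> dict:
    """Compare linear joint coding with the general limit, for a BSS and BSC(p)."""
    capacity = 1.0 - binary_entropy(p)
    rate_distortion = 1.0 - binary_entropy(d)   # R(D) of the BSS
    linear_rate = 1.0 - 2.0 * d                  # smallest R_L allowed by (7)
    return {
        "ratio": linear_rate / rate_distortion,        # R_L / R(D)
        "threshold": 1.0 / capacity,                   # 1 / C
        "linear_efficiency_cap": 1.0 / linear_rate,    # at most 1 / R_L
        "general_efficiency_limit": capacity / rate_distortion,  # C / R(D)
    }


for d in (0.25, 0.10):
    result = bss_over_bsc(0.10, d)
    provably_suboptimal = result["ratio"] > result["threshold"]
    print(d, {k: round(v, 2) for k, v in result.items()}, provably_suboptimal)
```

For D = 1/4 the ratio exceeds 1/C, so every linear joint scheme is sub-optimum; for D = .10 it does not, which leaves room for the "straight wire" encoder to be optimal.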